Overview

Dataset statistics

Number of variables: 12
Number of observations: 10738
Missing cells: 251
Missing cells (%): 0.2%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1006.8 KiB
Average record size in memory: 96.0 B
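The derived figures in this table can be cross-checked from the raw counts. The sketch below is only a sanity check, not part of the report itself: the per-variable missing counts are copied from the Variables section of this report, and the variable names in the code are illustrative.

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 12
n_observations = 10738
missing_cells = 251
total_size_kib = 1006.8

# Per-variable missing counts as listed in the "Variables" section.
missing_by_variable = {
    "customer_product_search_score": 42,
    "customer_stay_score": 37,
    "customer_product_variation_score": 46,
    "customer_order_score": 66,
    "customer_active_segment": 23,
    "X1": 37,
}

total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
avg_record_bytes = total_size_kib * 1024 / n_observations

print(sum(missing_by_variable.values()))  # total missing cells across variables
print(round(missing_pct, 1))              # missing cells as a percentage
print(round(avg_record_bytes, 1))         # average record size in bytes
```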

Variable types

Categorical: 4
Numeric: 8

Alerts

customer_id has a high cardinality: 10738 distinct values (High cardinality)
customer_product_search_score is highly correlated with customer_stay_score (High correlation)
customer_ctr_score is highly correlated with customer_stay_score and 1 other field (High correlation)
customer_stay_score is highly correlated with customer_product_search_score and 1 other field (High correlation)
customer_frequency_score is highly correlated with customer_product_variation_score and 2 other fields (High correlation)
customer_product_variation_score is highly correlated with customer_frequency_score and 2 other fields (High correlation)
customer_order_score is highly correlated with customer_frequency_score and 2 other fields (High correlation)
customer_affinity_score is highly correlated with customer_frequency_score and 2 other fields (High correlation)
customer_category is highly correlated with customer_ctr_score (High correlation)
customer_visit_score is highly correlated with customer_ctr_score (High correlation)
customer_ctr_score is highly correlated with customer_visit_score and 2 other fields (High correlation)
customer_stay_score is highly correlated with customer_ctr_score and 1 other field (High correlation)
customer_frequency_score is highly correlated with customer_product_variation_score and 1 other field (High correlation)
customer_product_variation_score is highly correlated with customer_frequency_score and 2 other fields (High correlation)
customer_order_score is highly correlated with customer_frequency_score and 2 other fields (High correlation)
customer_affinity_score is highly correlated with customer_product_variation_score and 1 other field (High correlation)
customer_category is highly correlated with customer_ctr_score and 1 other field (High correlation)
customer_ctr_score is highly correlated with customer_stay_score (High correlation)
customer_stay_score is highly correlated with customer_ctr_score (High correlation)
customer_frequency_score is highly correlated with customer_product_variation_score and 2 other fields (High correlation)
customer_product_variation_score is highly correlated with customer_frequency_score and 2 other fields (High correlation)
customer_order_score is highly correlated with customer_frequency_score and 2 other fields (High correlation)
customer_affinity_score is highly correlated with customer_frequency_score and 2 other fields (High correlation)
customer_visit_score is highly correlated with customer_ctr_score and 3 other fields (High correlation)
customer_product_search_score is highly correlated with customer_ctr_score and 1 other field (High correlation)
customer_ctr_score is highly correlated with customer_visit_score and 7 other fields (High correlation)
customer_stay_score is highly correlated with customer_visit_score and 6 other fields (High correlation)
customer_frequency_score is highly correlated with customer_ctr_score and 4 other fields (High correlation)
customer_product_variation_score is highly correlated with customer_ctr_score and 5 other fields (High correlation)
customer_order_score is highly correlated with customer_ctr_score and 5 other fields (High correlation)
customer_affinity_score is highly correlated with customer_order_score and 1 other field (High correlation)
customer_active_segment is highly correlated with customer_visit_score and 2 other fields (High correlation)
X1 is highly correlated with customer_product_variation_score and 2 other fields (High correlation)
customer_category is highly correlated with customer_visit_score and 5 other fields (High correlation)
customer_id is uniformly distributed (Uniform)
customer_id has unique values (Unique)
customer_visit_score has unique values (Unique)
customer_ctr_score has unique values (Unique)
customer_frequency_score has unique values (Unique)
customer_affinity_score has unique values (Unique)

Reproduction

Analysis started: 2022-01-25 02:46:38.886903
Analysis finished: 2022-01-25 02:47:06.075853
Duration: 27.19 seconds
Software version: pandas-profiling v3.1.0
Download configuration: config.json

Variables

customer_id
Categorical

HIGH CARDINALITY
UNIFORM
UNIQUE

Distinct: 10738
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 84.0 KiB

Value | Count
csid_1 | 1
csid_7153 | 1
csid_7155 | 1
csid_7156 | 1
csid_7157 | 1
Other values (10733) | 10733

Length

Max length: 10
Median length: 9
Mean length: 8.965729186
Min length: 6
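These length statistics are consistent with identifiers of the form csid_1 through csid_10738. That range is an assumption based on the first and last sample rows and on the distinct count of 10738; the report does not state it explicitly. A short check under that assumption:

```python
from statistics import mean, median

# Assumption: the identifiers are exactly csid_1 ... csid_10738 with no gaps.
ids = [f"csid_{i}" for i in range(1, 10738 + 1)]
lengths = [len(s) for s in ids]

min_len = min(lengths)
max_len = max(lengths)
median_len = median(lengths)
mean_len = mean(lengths)

print(min_len, max_len, median_len, round(mean_len, 9))
```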

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 10738
Unique (%): 100.0%

Sample

1st row: csid_1
2nd row: csid_2
3rd row: csid_3
4th row: csid_4
5th row: csid_5

Common Values

Value | Count | Frequency (%)
csid_1 | 1 | < 0.1%
csid_7153 | 1 | < 0.1%
csid_7155 | 1 | < 0.1%
csid_7156 | 1 | < 0.1%
csid_7157 | 1 | < 0.1%
csid_7158 | 1 | < 0.1%
csid_7159 | 1 | < 0.1%
csid_7160 | 1 | < 0.1%
csid_7161 | 1 | < 0.1%
csid_7162 | 1 | < 0.1%
Other values (10728) | 10728 | 99.9%

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
csid_1 | 1 | < 0.1%
csid_23 | 1 | < 0.1%
csid_42 | 1 | < 0.1%
csid_20 | 1 | < 0.1%
csid_3 | 1 | < 0.1%
csid_4 | 1 | < 0.1%
csid_5 | 1 | < 0.1%
csid_6 | 1 | < 0.1%
csid_7 | 1 | < 0.1%
csid_8 | 1 | < 0.1%
Other values (10728) | 10728 | 99.9%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

customer_visit_score
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
UNIQUE

Distinct: 10738
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 19.06094129
Minimum: 0.5689647667
Maximum: 47.30669098
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 84.0 KiB

Quantile statistics

Minimum: 0.5689647667
5-th percentile: 7.442481958
Q1: 13.51802134
median: 18.77410921
Q3: 24.50171939
95-th percentile: 31.42445376
Maximum: 47.30669098
Range: 46.73772622
Interquartile range (IQR): 10.98369805

Descriptive statistics

Standard deviation: 7.419609076
Coefficient of variation (CV): 0.389257223
Kurtosis: -0.4065214262
Mean: 19.06094129
Median Absolute Deviation (MAD): 5.439947497
Skewness: 0.1014477924
Sum: 204676.3876
Variance: 55.05059884
Monotonicity: Not monotonic
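Several of these statistics are simple functions of the others, which gives a quick way to check the table for transcription errors. The sketch below recomputes them from the values reported for customer_visit_score; it uses the standard textbook definitions (CV = standard deviation / mean, IQR = Q3 - Q1, and so on), which are assumed here to match what the profiling tool computes.

```python
# Values reported in this section for customer_visit_score.
n = 10738                      # observations, none missing
mean = 19.06094129
std = 7.419609076
q1, q3 = 13.51802134, 24.50171939
minimum, maximum = 0.5689647667, 47.30669098

cv = std / mean                # coefficient of variation
variance = std ** 2            # variance is the squared standard deviation
iqr = q3 - q1                  # interquartile range
value_range = maximum - minimum
total = mean * n               # sum of all values

print(cv, variance, iqr, value_range, total)
```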
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
13.16842493 | 1 | < 0.1%
31.58945448 | 1 | < 0.1%
14.7593968 | 1 | < 0.1%
9.136171335 | 1 | < 0.1%
3.790594219 | 1 | < 0.1%
20.37354099 | 1 | < 0.1%
24.3906417 | 1 | < 0.1%
13.51919538 | 1 | < 0.1%
29.63397706 | 1 | < 0.1%
18.9473251 | 1 | < 0.1%
Other values (10728) | 10728 | 99.9%

Value | Count | Frequency (%)
0.5689647667 | 1 | < 0.1%
0.6441806855 | 1 | < 0.1%
0.6650534717 | 1 | < 0.1%
0.715215517 | 1 | < 0.1%
0.9186268439 | 1 | < 0.1%
0.9418460834 | 1 | < 0.1%
0.9797893876 | 1 | < 0.1%
0.9957208668 | 1 | < 0.1%
1.045409591 | 1 | < 0.1%
1.062848533 | 1 | < 0.1%

Value | Count | Frequency (%)
47.30669098 | 1 | < 0.1%
43.92674833 | 1 | < 0.1%
43.75726982 | 1 | < 0.1%
42.34256741 | 1 | < 0.1%
42.19495825 | 1 | < 0.1%
41.97600383 | 1 | < 0.1%
41.00608616 | 1 | < 0.1%
40.67115079 | 1 | < 0.1%
40.57284952 | 1 | < 0.1%
40.3873391 | 1 | < 0.1%

customer_product_search_score
Real number (ℝ)

HIGH CORRELATION
HIGH CORRELATION

Distinct: 10696
Distinct (%): 100.0%
Missing: 42
Missing (%): 0.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.274847153
Minimum: -0.1619399818
Maximum: 16.6382433
Zeros: 0
Zeros (%): 0.0%
Negative: 2
Negative (%): < 0.1%
Memory size: 84.0 KiB

Quantile statistics

Minimum: -0.1619399818
5-th percentile: 2.262566301
Q1: 3.971586843
median: 5.218479286
Q3: 6.520363539
95-th percentile: 8.386104872
Maximum: 16.6382433
Range: 16.80018328
Interquartile range (IQR): 2.548776696

Descriptive statistics

Standard deviation: 1.882558586
Coefficient of variation (CV): 0.3568934855
Kurtosis: 0.545163275
Mean: 5.274847153
Median Absolute Deviation (MAD): 1.276309404
Skewness: 0.2892716474
Sum: 56419.76515
Variance: 3.54402683
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
9.447661691 | 1 | < 0.1%
4.738416608 | 1 | < 0.1%
7.149131419 | 1 | < 0.1%
6.467830973 | 1 | < 0.1%
4.427239253 | 1 | < 0.1%
6.141753694 | 1 | < 0.1%
5.719343629 | 1 | < 0.1%
2.341663508 | 1 | < 0.1%
3.354625648 | 1 | < 0.1%
5.232748727 | 1 | < 0.1%
Other values (10686) | 10686 | 99.5%
(Missing) | 42 | 0.4%

Value | Count | Frequency (%)
-0.1619399818 | 1 | < 0.1%
-0.04875713064 | 1 | < 0.1%
0.0644344974 | 1 | < 0.1%
0.08783044642 | 1 | < 0.1%
0.1758818048 | 1 | < 0.1%
0.2199372422 | 1 | < 0.1%
0.2736137469 | 1 | < 0.1%
0.2755627436 | 1 | < 0.1%
0.3516462314 | 1 | < 0.1%
0.3961314961 | 1 | < 0.1%

Value | Count | Frequency (%)
16.6382433 | 1 | < 0.1%
16.63088664 | 1 | < 0.1%
15.51932408 | 1 | < 0.1%
14.65319495 | 1 | < 0.1%
14.64986837 | 1 | < 0.1%
14.60104289 | 1 | < 0.1%
14.48426809 | 1 | < 0.1%
14.34562558 | 1 | < 0.1%
13.60208564 | 1 | < 0.1%
13.4746232 | 1 | < 0.1%

customer_ctr_score
Real number (ℝ)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
UNIQUE

Distinct: 10738
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.1759119806
Minimum: -0.5479890838
Maximum: 2.679474242
Zeros: 0
Zeros (%): 0.0%
Negative: 2258
Negative (%): 21.0%
Memory size: 84.0 KiB

Quantile statistics

Minimum: -0.5479890838
5-th percentile: -0.07250584954
Q1: 0.01084001214
median: 0.07407813627
Q3: 0.1596064355
95-th percentile: 1.072821684
Maximum: 2.679474242
Range: 3.227463326
Interquartile range (IQR): 0.1487664233

Descriptive statistics

Standard deviation: 0.3728289383
Coefficient of variation (CV): 2.119406177
Kurtosis: 10.96033251
Mean: 0.1759119806
Median Absolute Deviation (MAD): 0.07104629897
Skewness: 3.216021049
Sum: 1888.942848
Variance: 0.1390014172
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
-0.07020264289 | 1 | < 0.1%
0.0005898212806 | 1 | < 0.1%
0.04341743782 | 1 | < 0.1%
0.08046201535 | 1 | < 0.1%
1.177369407 | 1 | < 0.1%
0.06061555739 | 1 | < 0.1%
0.07101235966 | 1 | < 0.1%
-0.08614276517 | 1 | < 0.1%
0.01564443462 | 1 | < 0.1%
0.04396936788 | 1 | < 0.1%
Other values (10728) | 10728 | 99.9%

Value | Count | Frequency (%)
-0.5479890838 | 1 | < 0.1%
-0.5462274631 | 1 | < 0.1%
-0.5384683288 | 1 | < 0.1%
-0.5339414237 | 1 | < 0.1%
-0.5324857093 | 1 | < 0.1%
-0.4901018834 | 1 | < 0.1%
-0.4865959046 | 1 | < 0.1%
-0.4811338877 | 1 | < 0.1%
-0.4802283604 | 1 | < 0.1%
-0.4797968087 | 1 | < 0.1%

Value | Count | Frequency (%)
2.679474242 | 1 | < 0.1%
2.571238413 | 1 | < 0.1%
2.57043864 | 1 | < 0.1%
2.510406547 | 1 | < 0.1%
2.390943097 | 1 | < 0.1%
2.38959897 | 1 | < 0.1%
2.384603231 | 1 | < 0.1%
2.383494842 | 1 | < 0.1%
2.375461326 | 1 | < 0.1%
2.360680643 | 1 | < 0.1%

customer_stay_score
Real number (ℝ)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 10701
Distinct (%): 100.0%
Missing: 37
Missing (%): 0.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.3742300618
Minimum: -0.4624940639
Maximum: 14.70191417
Zeros: 0
Zeros (%): 0.0%
Negative: 3962
Negative (%): 36.9%
Memory size: 84.0 KiB

Quantile statistics

Minimum: -0.4624940639
5-th percentile: -0.1002052956
Q1: -0.02766573337
median: 0.03720079496
Q3: 0.1790287653
95-th percentile: 2.441966972
Maximum: 14.70191417
Range: 15.16440824
Interquartile range (IQR): 0.2066944986

Descriptive statistics

Standard deviation: 1.222030798
Coefficient of variation (CV): 3.265453321
Kurtosis: 29.79532391
Mean: 0.3742300618
Median Absolute Deviation (MAD): 0.08274564002
Skewness: 5.008726307
Sum: 4004.635892
Variance: 1.493359272
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
-0.1395408793 | 1 | < 0.1%
1.407613358 | 1 | < 0.1%
-0.003648490558 | 1 | < 0.1%
-5.021905798 × 10^-5 | 1 | < 0.1%
-0.05240824015 | 1 | < 0.1%
0.04723501864 | 1 | < 0.1%
0.1312583748 | 1 | < 0.1%
-0.004738666148 | 1 | < 0.1%
-0.01384644227 | 1 | < 0.1%
0.001954872595 | 1 | < 0.1%
Other values (10691) | 10691 | 99.6%
(Missing) | 37 | 0.3%

Value | Count | Frequency (%)
-0.4624940639 | 1 | < 0.1%
-0.3895174278 | 1 | < 0.1%
-0.3724808355 | 1 | < 0.1%
-0.3532560254 | 1 | < 0.1%
-0.3492770074 | 1 | < 0.1%
-0.3458772237 | 1 | < 0.1%
-0.3329127087 | 1 | < 0.1%
-0.3303314744 | 1 | < 0.1%
-0.3171502534 | 1 | < 0.1%
-0.3157417019 | 1 | < 0.1%

Value | Count | Frequency (%)
14.70191417 | 1 | < 0.1%
14.28113287 | 1 | < 0.1%
13.53972037 | 1 | < 0.1%
12.40849733 | 1 | < 0.1%
11.99354174 | 1 | < 0.1%
11.93459646 | 1 | < 0.1%
11.65400787 | 1 | < 0.1%
11.59071055 | 1 | < 0.1%
11.55208996 | 1 | < 0.1%
11.414581 | 1 | < 0.1%

customer_frequency_score
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
UNIQUE

Distinct: 10738
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.376894688
Minimum: 0.02857521051
Maximum: 52.39501392
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 84.0 KiB

Quantile statistics

Minimum: 0.02857521051
5-th percentile: 0.1542461741
Q1: 0.3136096545
median: 0.5168299359
Q3: 1.125379515
95-th percentile: 14.11275303
Maximum: 52.39501392
Range: 52.36643871
Interquartile range (IQR): 0.8117698602

Descriptive statistics

Standard deviation: 5.601910934
Coefficient of variation (CV): 2.356819157
Kurtosis: 19.19443114
Mean: 2.376894688
Median Absolute Deviation (MAD): 0.2615149293
Skewness: 4.083012882
Sum: 25523.09516
Variance: 31.38140612
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0.4369560308 | 1 | < 0.1%
0.3531479719 | 1 | < 0.1%
20.2667777 | 1 | < 0.1%
1.933699216 | 1 | < 0.1%
34.82408972 | 1 | < 0.1%
0.503856912 | 1 | < 0.1%
13.6320122 | 1 | < 0.1%
12.17778256 | 1 | < 0.1%
5.618736672 | 1 | < 0.1%
0.5456667329 | 1 | < 0.1%
Other values (10728) | 10728 | 99.9%

Value | Count | Frequency (%)
0.02857521051 | 1 | < 0.1%
0.03332008134 | 1 | < 0.1%
0.0355902314 | 1 | < 0.1%
0.03591151507 | 1 | < 0.1%
0.03660536941 | 1 | < 0.1%
0.04070790865 | 1 | < 0.1%
0.04130951588 | 1 | < 0.1%
0.04324217365 | 1 | < 0.1%
0.0461660583 | 1 | < 0.1%
0.04668537241 | 1 | < 0.1%

Value | Count | Frequency (%)
52.39501392 | 1 | < 0.1%
49.67938001 | 1 | < 0.1%
49.03419464 | 1 | < 0.1%
47.81685008 | 1 | < 0.1%
46.92130903 | 1 | < 0.1%
46.75971585 | 1 | < 0.1%
46.70299502 | 1 | < 0.1%
46.44652515 | 1 | < 0.1%
46.12204014 | 1 | < 0.1%
45.62266754 | 1 | < 0.1%

customer_product_variation_score
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 10692
Distinct (%): 100.0%
Missing: 46
Missing (%): 0.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.788179529
Minimum: 2.752836148
Maximum: 18.74383572
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 84.0 KiB

Quantile statistics

Minimum: 2.752836148
5-th percentile: 3.541244833
Q1: 4.193234472
median: 4.842574595
Q3: 6.286400327
95-th percentile: 11.66568404
Maximum: 18.74383572
Range: 15.99099957
Interquartile range (IQR): 2.093165855

Descriptive statistics

Standard deviation: 2.531309458
Coefficient of variation (CV): 0.4373239367
Kurtosis: 3.191873393
Mean: 5.788179529
Median Absolute Deviation (MAD): 0.8261130491
Skewness: 1.851646948
Sum: 61887.21552
Variance: 6.40752757
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
4.705760939 | 1 | < 0.1%
5.306686105 | 1 | < 0.1%
10.6925668 | 1 | < 0.1%
9.499833305 | 1 | < 0.1%
10.2482978 | 1 | < 0.1%
5.344186221 | 1 | < 0.1%
10.21667422 | 1 | < 0.1%
13.01691952 | 1 | < 0.1%
12.69785088 | 1 | < 0.1%
10.52207597 | 1 | < 0.1%
Other values (10682) | 10682 | 99.5%
(Missing) | 46 | 0.4%

Value | Count | Frequency (%)
2.752836148 | 1 | < 0.1%
2.787879879 | 1 | < 0.1%
2.812295598 | 1 | < 0.1%
2.821770078 | 1 | < 0.1%
2.822925888 | 1 | < 0.1%
2.853035872 | 1 | < 0.1%
2.860919714 | 1 | < 0.1%
2.888028761 | 1 | < 0.1%
2.890614443 | 1 | < 0.1%
2.905874869 | 1 | < 0.1%

Value | Count | Frequency (%)
18.74383572 | 1 | < 0.1%
18.48790753 | 1 | < 0.1%
18.42971169 | 1 | < 0.1%
18.3418688 | 1 | < 0.1%
18.26678114 | 1 | < 0.1%
18.04871108 | 1 | < 0.1%
17.67544028 | 1 | < 0.1%
17.64926437 | 1 | < 0.1%
17.33221461 | 1 | < 0.1%
17.32895438 | 1 | < 0.1%

customer_order_score
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 10672
Distinct (%): 100.0%
Missing: 66
Missing (%): 0.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.150070539
Minimum: 0.3633379501
Maximum: 9.090205509
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 84.0 KiB

Quantile statistics

Minimum: 0.3633379501
5-th percentile: 1.532564209
Q1: 2.454017385
median: 3.118394172
Q3: 3.756566397
95-th percentile: 4.892306349
Maximum: 9.090205509
Range: 8.726867559
Interquartile range (IQR): 1.302549012

Descriptive statistics

Standard deviation: 1.03541551
Coefficient of variation (CV): 0.3286959759
Kurtosis: 1.210741347
Mean: 3.150070539
Median Absolute Deviation (MAD): 0.6528132847
Skewness: 0.5768648974
Sum: 33617.55279
Variance: 1.072085278
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
2.537985052 | 1 | < 0.1%
3.122050243 | 1 | < 0.1%
1.543632469 | 1 | < 0.1%
1.851068545 | 1 | < 0.1%
1.390036732 | 1 | < 0.1%
3.301753618 | 1 | < 0.1%
2.047554951 | 1 | < 0.1%
2.231378199 | 1 | < 0.1%
2.868131962 | 1 | < 0.1%
1.923436413 | 1 | < 0.1%
Other values (10662) | 10662 | 99.3%
(Missing) | 66 | 0.6%

Value | Count | Frequency (%)
0.3633379501 | 1 | < 0.1%
0.5371367527 | 1 | < 0.1%
0.5610723909 | 1 | < 0.1%
0.5692797177 | 1 | < 0.1%
0.5997546164 | 1 | < 0.1%
0.6094512156 | 1 | < 0.1%
0.6511893556 | 1 | < 0.1%
0.713192621 | 1 | < 0.1%
0.7212757881 | 1 | < 0.1%
0.7238911627 | 1 | < 0.1%

Value | Count | Frequency (%)
9.090205509 | 1 | < 0.1%
8.951938748 | 1 | < 0.1%
8.937619861 | 1 | < 0.1%
8.357390523 | 1 | < 0.1%
8.226249469 | 1 | < 0.1%
8.170225052 | 1 | < 0.1%
8.101784031 | 1 | < 0.1%
8.061836027 | 1 | < 0.1%
8.046257167 | 1 | < 0.1%
8.010350996 | 1 | < 0.1%

customer_affinity_score
Real number (ℝ)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
UNIQUE

Distinct: 10738
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 17.06183579
Minimum: -0.4868340563
Maximum: 248.5527547
Zeros: 0
Zeros (%): 0.0%
Negative: 718
Negative (%): 6.7%
Memory size: 84.0 KiB

Quantile statistics

Minimum: -0.4868340563
5-th percentile: -0.08343869615
Q1: 4.530085389
median: 12.65335707
Q3: 23.11457668
95-th percentile: 50.46467763
Maximum: 248.5527547
Range: 249.0395888
Interquartile range (IQR): 18.58449129

Descriptive statistics

Standard deviation: 18.76269336
Coefficient of variation (CV): 1.099687841
Kurtosis: 16.85754965
Mean: 17.06183579
Median Absolute Deviation (MAD): 8.987608416
Skewness: 2.993483837
Sum: 183209.9927
Variance: 352.0386622
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
7.959503221 | 1 | < 0.1%
5.174743812 | 1 | < 0.1%
-0.4033399061 | 1 | < 0.1%
1.32251184 | 1 | < 0.1%
-0.09010904113 | 1 | < 0.1%
12.870484 | 1 | < 0.1%
-0.1571265563 | 1 | < 0.1%
-0.2660333716 | 1 | < 0.1%
-0.2323887605 | 1 | < 0.1%
7.618839839 | 1 | < 0.1%
Other values (10728) | 10728 | 99.9%

Value | Count | Frequency (%)
-0.4868340563 | 1 | < 0.1%
-0.482894096 | 1 | < 0.1%
-0.473328527 | 1 | < 0.1%
-0.4556253366 | 1 | < 0.1%
-0.4542975823 | 1 | < 0.1%
-0.4379394688 | 1 | < 0.1%
-0.4305667991 | 1 | < 0.1%
-0.4291398077 | 1 | < 0.1%
-0.4199759236 | 1 | < 0.1%
-0.4186963995 | 1 | < 0.1%

Value | Count | Frequency (%)
248.5527547 | 1 | < 0.1%
246.9369655 | 1 | < 0.1%
218.4587702 | 1 | < 0.1%
206.6697283 | 1 | < 0.1%
198.923264 | 1 | < 0.1%
197.3232906 | 1 | < 0.1%
182.2244816 | 1 | < 0.1%
173.2061211 | 1 | < 0.1%
167.6246271 | 1 | < 0.1%
165.0966305 | 1 | < 0.1%

customer_active_segment
Categorical

HIGH CORRELATION

Distinct: 5
Distinct (%): < 0.1%
Missing: 23
Missing (%): 0.2%
Memory size: 84.0 KiB

Value | Count
C | 4919
B | 4430
D | 536
AA | 418
A | 412

Length

Max length: 2
Median length: 1
Mean length: 1.039010733
Min length: 1

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: C
2nd row: C
3rd row: C
4th row: AA
5th row: C

Common Values

Value | Count | Frequency (%)
C | 4919 | 45.8%
B | 4430 | 41.3%
D | 536 | 5.0%
AA | 418 | 3.9%
A | 412 | 3.8%
(Missing) | 23 | 0.2%

Length

Histogram of lengths of the category

Pie chart

Value | Count | Frequency (%)
c | 4919 | 45.9%
b | 4430 | 41.3%
d | 536 | 5.0%
aa | 418 | 3.9%
a | 412 | 3.8%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

X1
Categorical

HIGH CORRELATION

Distinct: 5
Distinct (%): < 0.1%
Missing: 37
Missing (%): 0.3%
Memory size: 84.0 KiB

Value | Count
BA | 4511
A | 2268
F | 2235
AA | 1611
E | 76

Length

Max length: 2
Median length: 2
Mean length: 1.572096066
Min length: 1

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: F
2nd row: A
3rd row: BA
4th row: F
5th row: AA

Common Values

Value | Count | Frequency (%)
BA | 4511 | 42.0%
A | 2268 | 21.1%
F | 2235 | 20.8%
AA | 1611 | 15.0%
E | 76 | 0.7%
(Missing) | 37 | 0.3%

Length

Histogram of lengths of the category

Pie chart

Value | Count | Frequency (%)
ba | 4511 | 42.2%
a | 2268 | 21.2%
f | 2235 | 20.9%
aa | 1611 | 15.1%
e | 76 | 0.7%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

customer_category
Categorical

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 84.0 KiB

Value | Count
0 | 9443
1 | 1295

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 0
Distinct characters: 0
Distinct categories: 0
Distinct scripts: 0
Distinct blocks: 0
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 9443 | 87.9%
1 | 1295 | 12.1%

Length

Histogram of lengths of the category

Pie chart

Value | Count | Frequency (%)
0 | 9443 | 87.9%
1 | 1295 | 12.1%

Most occurring characters

ValueCountFrequency (%)
No values found.

Most occurring categories

ValueCountFrequency (%)
No values found.

Most frequent character per category

Most occurring scripts

ValueCountFrequency (%)
No values found.

Most frequent character per script

Most occurring blocks

ValueCountFrequency (%)
No values found.

Most frequent character per block

Interactions

[64 interaction plots (Matplotlib v3.5.1 SVG figures) not reproduced]

Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
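The calculation described above can be sketched in plain Python. This is a minimal illustration, not the code the report generator runs; the helper names are hypothetical, and ties are handled by assigning average ranks, which is a common convention assumed here.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def pearson(x, y):
    """Covariance divided by the product of the standard deviations."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))


# A nonlinear but strictly increasing relationship is perfectly monotonic.
rho_up = spearman_rho([1, 2, 3, 4, 5], [1, 4, 9, 16, 25])
rho_down = spearman_rho([1, 2, 3, 4, 5], [25, 16, 9, 4, 1])
print(rho_up, rho_down)
```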

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
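As a minimal sketch of that formula, the snippet below computes r in plain Python and illustrates the invariance under shifts and positive rescaling mentioned above. The function name and the sample data are illustrative only.

```python
from statistics import mean


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    # The 1/n factors of covariance and standard deviations cancel out.
    return cov / (sx * sy)


x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
r = pearson_r(x, y)

# Shift and rescale each variable separately (positive scale factors).
x_scaled = [2 * a + 3 for a in x]
y_scaled = [0.5 * b - 1 for b in y]
r_scaled = pearson_r(x_scaled, y_scaled)

print(r, r_scaled)
```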

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
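The pair-counting definition above corresponds to the variant usually called tau-a. The sketch below implements exactly that definition in plain Python; statistical libraries often default to a tie-adjusted variant (tau-b), so results may differ when the data contain ties. The function name is illustrative.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs, as described above."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1      # both variables move in the same direction
        elif direction < 0:
            discordant += 1      # the variables move in opposite directions
        # Pairs tied in x or y count as neither concordant nor discordant.
    n = len(x)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


tau = kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4])        # one swapped pair
tau_perfect = kendall_tau_a([1, 2, 3, 4], [10, 20, 30, 40])
tau_reversed = kendall_tau_a([1, 2, 3, 4], [40, 30, 20, 10])
print(tau, tau_perfect, tau_reversed)
```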

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been proved to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013.
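A minimal sketch of a bias-corrected Cramér's V follows. It uses the correction commonly attributed to Bergsma (2013): the squared phi coefficient and the table dimensions are each adjusted by terms in 1/(n - 1). Whether this matches the report generator's implementation line for line is an assumption; the function name is illustrative and missing values are assumed to have been removed beforehand.

```python
from collections import Counter
from math import sqrt


def cramers_v_corrected(a, b):
    """Bias-corrected Cramér's V for two equally long sequences of labels."""
    n = len(a)
    joint = Counter(zip(a, b))
    row_totals = Counter(a)
    col_totals = Counter(b)
    r, k = len(row_totals), len(col_totals)

    # Pearson chi-squared statistic of the contingency table.
    chi2 = 0.0
    for i, row_count in row_totals.items():
        for j, col_count in col_totals.items():
            expected = row_count * col_count / n
            observed = joint.get((i, j), 0)
            chi2 += (observed - expected) ** 2 / expected

    phi2 = chi2 / n
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    return sqrt(phi2_corr / denom) if denom > 0 else 0.0


# Perfect association: each label of a maps to exactly one label of b.
v_perfect = cramers_v_corrected(["x"] * 50 + ["y"] * 50, ["p"] * 50 + ["q"] * 50)
# Independence: every combination occurs equally often.
v_independent = cramers_v_corrected(["x", "x", "y", "y"], ["p", "q", "p", "q"])
print(v_perfect, v_independent)
```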

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available for Phik.

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.
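Nullity correlation, as described for the heatmap, can be illustrated as the Pearson correlation between the "is missing" indicators of two columns. The sketch below is an assumption about how such a value could be computed, not the plotting library's actual code; the function names and sample columns are hypothetical.

```python
def is_missing(value):
    """Treat None and float NaN as missing."""
    return value is None or (isinstance(value, float) and value != value)


def nullity_correlation(col_a, col_b):
    """Pearson correlation between the missingness indicators of two columns.

    Returns None when either column is fully present or fully missing,
    because the correlation is undefined for a constant indicator.
    """
    a = [1.0 if is_missing(v) else 0.0 for v in col_a]
    b = [1.0 if is_missing(v) else 0.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    if var_a == 0 or var_b == 0:
        return None
    return cov / (var_a * var_b) ** 0.5


scores = [1.0, None, 3.0, None, 5.0]
labels = ["u", None, "w", None, "y"]        # missing in the same rows as scores
other = [None, 2.0, None, 4.0, None]        # missing exactly where scores is present
complete = [1, 2, 3, 4, 5]                  # never missing

same = nullity_correlation(scores, labels)
opposite = nullity_correlation(scores, other)
undefined = nullity_correlation(scores, complete)
print(same, opposite, undefined)
```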

Sample

First rows

(index) | customer_id | customer_visit_score | customer_product_search_score | customer_ctr_score | customer_stay_score | customer_frequency_score | customer_product_variation_score | customer_order_score | customer_affinity_score | customer_active_segment | X1 | customer_category
0 | csid_1 | 13.168425 | 9.447662 | -0.070203 | -0.139541 | 0.436956 | 4.705761 | 2.537985 | 7.959503 | C | F | 0
1 | csid_2 | 17.092979 | 7.329056 | 0.153298 | -0.102726 | 0.380340 | 4.205138 | 4.193444 | 17.517381 | C | A | 0
2 | csid_3 | 17.505334 | 5.143676 | 0.106709 | 0.262834 | 0.417648 | 4.479070 | 3.878971 | 12.595155 | C | BA | 0
3 | csid_4 | 31.423381 | 4.917740 | -0.020226 | -0.100526 | 0.778130 | 5.055535 | 2.708940 | 4.795073 | AA | F | 0
4 | csid_5 | 11.909502 | 4.237073 | 0.187178 | 0.172891 | 0.162067 | 3.445247 | 3.677360 | 56.636326 | C | AA | 0
5 | csid_6 | 9.007922 | 7.051568 | 0.161564 | 0.040997 | 0.191935 | 4.209840 | 3.181961 | 18.862680 | C | BA | 0
6 | csid_7 | 13.707109 | 5.625179 | 0.009634 | -0.019998 | 0.177622 | 4.165093 | 4.689834 | 109.203352 | B | E | 0
7 | csid_8 | 32.042122 | 3.563568 | -0.050730 | NaN | 0.257060 | 4.366761 | 4.041260 | 24.036321 | AAA | 0
8 | csid_9 | 20.434181 | 5.111682 | 0.133922 | 0.036893 | 0.442314 | 4.759516 | 3.407424 | 17.078123 | C | BA | 0
9 | csid_10 | 13.778214 | 3.829299 | 0.159102 | 0.165818 | 0.558187 | 6.255980 | 3.315462 | 9.443864 | B | BA | 0

Last rows

(index) | customer_id | customer_visit_score | customer_product_search_score | customer_ctr_score | customer_stay_score | customer_frequency_score | customer_product_variation_score | customer_order_score | customer_affinity_score | customer_active_segment | X1 | customer_category
10728 | csid_10729 | 24.772317 | 4.753238 | 0.019578 | -0.070097 | 0.556264 | 4.590020 | 3.126145 | 12.193862 | C | A | 0
10729 | csid_10730 | 11.657455 | 6.233353 | 0.007517 | -0.016122 | 0.476700 | 4.024655 | 2.727740 | 22.214286 | B | A | 0
10730 | csid_10731 | 18.793887 | 4.199826 | 0.143747 | 0.219525 | 0.267384 | 3.867731 | 2.893106 | 28.685574 | B | BA | 0
10731 | csid_10732 | 29.094167 | 6.391500 | -0.051283 | -0.079743 | 0.434865 | 4.791949 | 2.244512 | 6.251333 | B | BA | 0
10732 | csid_10733 | 14.664036 | 5.341811 | 0.043920 | -0.125090 | 0.269019 | 4.563034 | 3.685176 | 14.066261 | C | A | 0
10733 | csid_10734 | 23.672615 | 6.701514 | 0.092879 | -0.017332 | 1.210397 | 7.003663 | 3.027084 | 1.952911 | C | BA | 0
10734 | csid_10735 | 25.673028 | 6.497796 | 0.050216 | -0.047211 | 0.725230 | 5.407507 | 3.104172 | 5.124286 | C | BA | 0
10735 | csid_10736 | 31.676844 | 7.799880 | 0.062961 | -0.032765 | 0.318118 | 5.598486 | 2.403051 | 21.864188 | A | BA | 0
10736 | csid_10737 | 28.441780 | 5.588302 | -0.093931 | 0.081586 | 0.132177 | 3.616492 | 4.972243 | 86.969977 | B | AA | 0
10737 | csid_10738 | 20.663035 | 4.478301 | 0.253165 | 0.381349 | 0.504904 | 4.181092 | 4.469215 | 27.770899 | B | A | 0